Geosocial Graph-Based Community Detection
We apply spectral clustering and multislice modularity optimization to a Los
Angeles Police Department field interview card data set. To detect communities
(i.e., cohesive groups of vertices), we use both geographic and social
information about stops involving street gang members in the LAPD district of
Hollenbeck. We then compare the algorithmically detected communities with known
gang identifications and argue that discrepancies are due to sparsity of social
connections in the data as well as complex underlying sociological factors that
blur distinctions between communities.
Comment: 5 pages, 4 figures. Workshop paper for the IEEE International Conference on Data Mining 2012: Workshop on Social Media Analysis and Mining.
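The building block behind the methods in this abstract is the (single-slice) modularity quality function. As a minimal sketch, assuming a NumPy adjacency matrix and integer community labels (the function name `modularity` and the toy graph are illustrative, not the paper's code):

```python
import numpy as np

def modularity(A, labels, gamma=1.0):
    """Newman-Girvan modularity with resolution gamma:
    Q = (1/2m) * sum_ij [A_ij - gamma * k_i * k_j / (2m)] * delta(c_i, c_j)."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)                       # node degrees
    two_m = k.sum()                         # 2m: twice the edge count
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    B = A - gamma * np.outer(k, k) / two_m  # modularity matrix
    return float((B * same).sum() / two_m)

# Two triangles joined by a single bridge edge: the triangle partition
# should score higher than an arbitrary one.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
triangles = [0, 0, 0, 1, 1, 1]
arbitrary = [0, 1, 0, 1, 0, 1]
```

For this toy graph, `modularity(A, triangles)` exceeds `modularity(A, arbitrary)`, and putting all nodes in one community gives Q = 0, as expected.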
A Method Based on Total Variation for Network Modularity Optimization using the MBO Scheme
The study of network structure is pervasive in sociology, biology, computer
science, and many other disciplines. One of the most important areas of network
science is the algorithmic detection of cohesive groups of nodes called
"communities". One popular approach to find communities is to maximize a
quality function known as "modularity" to achieve some sort of optimal
clustering of nodes. In this paper, we interpret the modularity function from a
novel perspective: we reformulate modularity optimization as a minimization
problem of an energy functional that consists of a total variation term and a
balance term. By employing numerical techniques from image processing
and compressive sensing -- such as convex splitting and the
Merriman-Bence-Osher (MBO) scheme -- we develop a variational algorithm for the
minimization problem. We present our computational results using both synthetic
benchmark networks and real data.
Comment: 23 pages.
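The core of an MBO-type scheme is the alternation the abstract describes: diffuse, then threshold. A minimal sketch on a graph, assuming a NumPy adjacency matrix and one-hot seed labels (this is a generic graph MBO iteration, not the paper's convex-splitting algorithm; all names are illustrative):

```python
import numpy as np

def graph_mbo(A, U0, dt=0.5, steps=10, inner=1):
    """Minimal MBO iteration on a graph: (i) a short implicit-Euler diffusion
    U <- (I + dt*L)^(-1) U with the combinatorial graph Laplacian L, then
    (ii) pointwise thresholding of each row back to a one-hot indicator."""
    n = A.shape[0]
    L = np.diag(A.sum(axis=1)) - A                # graph Laplacian
    diffuse = np.linalg.inv(np.eye(n) + dt * L)   # one implicit diffusion step
    U = U0.astype(float)
    for _ in range(steps):
        for _ in range(inner):
            U = diffuse @ U                       # diffusion
        U = np.eye(U.shape[1])[U.argmax(axis=1)]  # threshold to one-hot rows
    return U.argmax(axis=1)

# Two triangles joined by a bridge; seed two nodes per triangle and let
# the diffuse/threshold loop label the remaining nodes.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
U0 = np.zeros((6, 2))
U0[[0, 2], 0] = 1.0   # seeds for community 0
U0[[3, 5], 1] = 1.0   # seeds for community 1
labels = graph_mbo(A, U0)
```

The dense matrix inverse is only for compactness; at scale one would use a sparse solver or a spectral truncation of L, as is standard for these schemes.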
Multislice Modularity Optimization in Community Detection and Image Segmentation
Because networks can be used to represent many complex systems, they have
attracted considerable attention in physics, computer science, sociology, and
many other disciplines. One of the most important areas of network science is
the algorithmic detection of cohesive groups (i.e., "communities") of nodes. In
this paper, we algorithmically detect communities in social networks and image
data by optimizing multislice modularity. A key advantage of modularity
optimization is that it does not require prior knowledge of the number or sizes
of communities, and it is capable of finding network partitions that are
composed of communities of different sizes. By optimizing multislice modularity
and subsequently calculating diagnostics on the resulting network partitions,
it is thereby possible to obtain information about network structure across
multiple system scales. We illustrate this method on data from both social
networks and images, and we find that optimization of multislice modularity
performs well on these two tasks without the need for extensive
problem-specific adaptation. However, improving the computational speed of this
method remains a challenging open problem.
Comment: 3 pages, 2 figures. To appear in the IEEE International Conference on Data Mining PhD forum conference proceedings.
An Incremental Reseeding Strategy for Clustering
In this work we propose a simple and easily parallelizable algorithm for multiway graph partitioning. The algorithm alternates between three basic components: diffusing seed vertices over the graph, thresholding the diffused seeds, and then randomly reseeding the thresholded clusters. We demonstrate experimentally that the proper combination of these ingredients leads to an algorithm that achieves state-of-the-art performance in terms of cluster purity on standard benchmark datasets. Moreover, the algorithm runs an order of magnitude faster than the other algorithms that achieve comparable accuracy. We also describe a coarsen-cluster-refine approach, similar to GRACLUS and METIS, that removes an additional order of magnitude from the runtime of our algorithm while still maintaining competitive accuracy.
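The three components above can be sketched as a plant/grow/harvest loop. A minimal version, assuming a NumPy adjacency matrix, random-walk diffusion, and a seed count that increases each round (all names, the three diffusion steps, and the schedule are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def incremental_reseeding(A, k, iters=30, rng=0):
    """Plant random seeds in each current cluster, grow them with a few
    random-walk diffusion steps, then harvest by assigning each vertex to
    the cluster with the largest diffused seed mass; the number of seeds
    planted per cluster grows over the iterations."""
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    W = A / A.sum(axis=1, keepdims=True)   # row-stochastic random-walk matrix
    labels = rng.integers(0, k, size=n)    # random initial clustering
    seeds_per_cluster = 1
    for _ in range(iters):
        F = np.zeros((n, k))
        for c in range(k):                 # plant: sample seeds in cluster c
            members = np.flatnonzero(labels == c)
            if len(members):
                picks = rng.choice(members, size=min(seeds_per_cluster, len(members)),
                                   replace=False)
                F[picks, c] = 1.0
        for _ in range(3):                 # grow: diffuse the seed indicators
            F = W.T @ F
        labels = F.argmax(axis=1)          # harvest: threshold
        seeds_per_cluster += 1             # incremental reseeding
    return labels

# Two 4-cliques joined by a single bridge edge.
A = np.zeros((8, 8))
for i in range(4):
    for j in range(i + 1, 4):
        A[i, j] = A[j, i] = 1.0
        A[i + 4, j + 4] = A[j + 4, i + 4] = 1.0
A[3, 4] = A[4, 3] = 1.0
```

Because the loop is randomized, a single run can occasionally land in a poor partition; on this toy graph a handful of restarts reliably recovers the two cliques.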
Wide Neural Networks Forget Less Catastrophically
A primary focus area in continual learning research is alleviating the
"catastrophic forgetting" problem in neural networks by designing new
algorithms that are more robust to distribution shifts. While the recent
progress in continual learning literature is encouraging, our understanding of
what properties of neural networks contribute to catastrophic forgetting is
still limited. To address this, instead of focusing on continual learning
algorithms, in this work, we focus on the model itself and study the impact of
"width" of the neural network architecture on catastrophic forgetting, and show
that width has a surprisingly significant effect on forgetting. To explain this
effect, we study the learning dynamics of the network from various perspectives
such as gradient orthogonality, sparsity, and lazy training regime. We provide
potential explanations that are consistent with the empirical results across
different architectures and continual learning benchmarks.
Comment: ICML 2022.
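The effect the abstract describes is usually quantified with the standard forgetting metric: for each earlier task, the drop from its best accuracy during the sequence to its accuracy after the final task. A minimal sketch with made-up accuracy numbers (the matrices below are illustrative, not results from the paper):

```python
import numpy as np

def average_forgetting(acc):
    """Average forgetting over a task sequence. `acc[i, j]` is accuracy on
    task j measured right after training on task i (meaningful for j <= i);
    forgetting for task j is its best earlier accuracy minus its final one."""
    acc = np.asarray(acc, dtype=float)
    T = acc.shape[0]
    drops = [acc[j:T - 1, j].max() - acc[T - 1, j] for j in range(T - 1)]
    return float(np.mean(drops))

# Illustrative two-task accuracies: a "narrow" model whose task-0 accuracy
# drops from 0.95 to 0.70, and a "wide" model that barely drops.
narrow = np.array([[0.95, 0.00],
                   [0.70, 0.93]])
wide   = np.array([[0.95, 0.00],
                   [0.90, 0.94]])
```

Here `average_forgetting(narrow)` is 0.25 and `average_forgetting(wide)` is 0.05, mirroring the abstract's claim that wider networks forget less, though the numbers themselves are invented for the example.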
Upconversion Luminescence and Magnetic Tuning of NaLuF4
Fluorescent and magnetic bifunctional NaLuF4:Yb3+/Tm3+/Gd3+ nanocrystals were synthesized by a solvothermal method followed by surface modification. By changing the doping concentration of Gd3+, the shape, size, luminescent properties, and magnetic properties of the nanoparticles can be modulated. These NaLuF4:Yb3+/Tm3+/Gd3+ nanocrystals exhibit efficient blue upconversion fluorescence and excellent paramagnetic properties at room temperature. Based on luminescence resonance energy transfer (LRET), the upconversion nanoparticles (UCNPs) were confirmed to be an efficient fluorescent nanoprobe for detecting acriflavine. The concentration of acriflavine is easily derived from the integrated intensity ratio of the green (emission from acriflavine) to the blue (emission from the UCNPs) fluorescent signals. Based on this upconversion fluorescent nanoprobe, a detection limit for acriflavine of 0.32 μg/mL can be achieved.
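The ratiometric readout described above amounts to inverting a calibration curve relating the green/blue intensity ratio to concentration. A minimal sketch assuming a linear calibration and entirely hypothetical standard measurements (all numbers and names below are invented for illustration, not data from this work):

```python
import numpy as np

def estimate_concentration(i_green, i_blue, slope, intercept):
    """Invert a linear calibration R = slope * c + intercept, where R is
    the integrated green/blue intensity ratio and c the concentration."""
    ratio = i_green / i_blue
    return (ratio - intercept) / slope

# Hypothetical calibration standards (concentrations in ug/mL) and the
# green/blue ratios measured for them -- illustrative numbers only.
c_std = np.array([0.5, 1.0, 2.0, 4.0])
r_std = np.array([0.11, 0.21, 0.41, 0.81])
slope, intercept = np.polyfit(c_std, r_std, 1)
```

Given an unknown sample, `estimate_concentration` then maps its measured green and blue intensities back to a concentration; with the numbers above, a ratio of 0.41 maps to 2.0 ug/mL.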
Plex: Towards Reliability using Pretrained Large Model Extensions
A recent trend in artificial intelligence is the use of pretrained models for
language and vision tasks, which have achieved extraordinary performance but
also puzzling failures. Probing these models' abilities in diverse ways is
therefore critical to the field. In this paper, we explore the reliability of
models, where we define a reliable model as one that not only achieves strong
predictive performance but also performs well consistently over many
decision-making tasks involving uncertainty (e.g., selective prediction, open
set recognition), robust generalization (e.g., accuracy and proper scoring
rules such as log-likelihood on in- and out-of-distribution datasets), and
adaptation (e.g., active learning, few-shot uncertainty). We devise 10 types of
tasks over 40 datasets in order to evaluate different aspects of reliability on
both vision and language domains. To improve reliability, we develop ViT-Plex
and T5-Plex, pretrained large model extensions for vision and language
modalities, respectively. Plex greatly improves the state-of-the-art across
reliability tasks, and simplifies the traditional protocol as it improves the
out-of-the-box performance and does not require designing scores or tuning the
model for each task. We demonstrate scaling effects over model sizes up to 1B
parameters and pretraining dataset sizes up to 4B examples. We also demonstrate
Plex's capabilities on challenging tasks including zero-shot open set
recognition, active learning, and uncertainty in conversational language
understanding.
Comment: Code available at https://goo.gle/plex-cod
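One of the uncertainty tasks the abstract lists, selective prediction, has a simple core: abstain on the least-confident examples and report accuracy on the fraction kept. A minimal sketch using maximum softmax probability as the confidence score (this is a generic illustration of the task, not the Plex evaluation protocol; names and numbers are invented):

```python
import numpy as np

def selective_accuracy(probs, labels, coverage):
    """Selective prediction at a given coverage: rank examples by maximum
    softmax probability, keep the most confident `coverage` fraction,
    abstain on the rest, and report accuracy on the kept set."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    n_keep = int(np.ceil(coverage * len(labels)))
    keep = np.argsort(-conf)[:n_keep]      # most confident examples first
    return float((preds[keep] == labels[keep]).mean())

# Four toy predictions: the model's two mistakes are also its two
# least-confident examples, so accuracy improves as coverage shrinks.
probs = np.array([[0.90, 0.10],
                  [0.60, 0.40],
                  [0.20, 0.80],
                  [0.55, 0.45]])
labels = np.array([0, 1, 1, 0])
```

Sweeping `coverage` from 1.0 downward traces the risk-coverage curve along which selective-prediction methods are compared; a reliable model's accuracy should rise monotonically as it is allowed to abstain more.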